Abstract
We consider here the problem of Byzantine Agreement among n
synchronous processors. At most t < n/10 processors may be faulty. We
provide a randomized algorithm which takes O(log n) rounds and uses
$t \cdot n^{1+O(\log(n/(n-t)))}$ messages and signatures. Note that this
bound is nt if t = o(n). Our algorithm does not require the use of an
external global random bit; instead it only makes use of local random
choices. Our algorithm makes an error with probability at most $n^{-c}$,
where c > 1 is a constant which can be set arbitrarily large.
1. Introduction
The  Byzantine  Agreement  (BA)  problem  is  essentially  the  problem  of 
finding  a  protocol  for  reaching  agreement  among  n  distributed  processes 
of  which  at  most  t  may  be  faulty.  BA  was  defined  by  Pease,  Shostak  and 
Lamport  [PSL,  80]  and  is  essential  for  maintaining  coordination  and 
synchronization  among  processes  in  distributed  systems.  [FLP,  82] 
showed  that  the  BA  problem  has  no  deterministic  solution  in  the  case 
the  processes  are  asynchronous.  [DS,  81]  showed  that  t+1  rounds  are 
necessary for synchronous processes and deterministic protocols. Ben-Or
[BO, 83] gave a randomized solution to the asynchronous BA problem.
However, his algorithm needs an exponential number of messages and
rounds for $t = n^{1/2+\epsilon}$ for any $\epsilon > 0$, but requires a
constant number of rounds and a polynomial number of messages in the
case $t = O(n^{1/2})$.

Rabin [R, 83] introduced the assumption of a single random bit per
round (which all processes can read) to solve the asynchronous BA
problem, with no error, in an expected constant number of rounds and by
using $O(n^2)$ messages. He further observed that this message bound
could be reduced to $O(nt)$ messages. Rabin also showed how to modify
his algorithm to take only a fixed number R of rounds with reliability
$1 - 2^{-R}$ and by using the same number of messages. Apparently, the
"pre-dealt" nature of the random sequence of bits required for Rabin's
algorithm is crucial.

Rabin posed as an open problem finding a protocol to distributedly
decide on a global random bit. Broder and Dolev [BD, 84] have shown
that a protocol for finding a global random bit in the synchronous case
requires t+1 rounds in the worst case (which can be achieved anyway by
a deterministic solution). [B, 84] has shown how to construct a
"global" random bit by distributed randomization, in O(log n) expected
number of rounds. However, his construction is nonuniform in the sense
that each processor must follow a protocol possibly distinct from other
processors' protocols, which has to be precomputed and is apparently
not easily computable on line.

Our  contribution  is  the  discovery  of  a  randomized  algorithm  for 
synchronous  Byzantine  Agreement.  It  only  makes  use  of  authentication 
and  only  local  random  choice,  but  it  does  not  assume  a  global  or 
external random bit. Our algorithm achieves BA in O(log n) rounds, by
using $t \cdot n^{1+O(\log(n/(n-t)))}$ messages and signatures and by
making an error with probability at most $n^{-c}$, where c > 1 can be
set arbitrarily large. Note that this message and signature bound is nt
if t = o(n). We assume a two-valued initial message, but the algorithm
can be easily extended to the multivalued case, by known techniques.

2.    Definitions  and  assumptions 
Let $P_1, \ldots, P_n$ be processors (the "generals"). We assume that
every $P_i$ can directly exchange messages with every other $P_j$,
through some reliable means of communication. In the beginning of the
system's execution, another process $P_0$, distinct from
$P_1, \ldots, P_n$ (the sender or "commander"), sends a message $M_i$
to each $P_i$. $M_i$ is called the initial message, and its value is
$v \in \{0, 1\}$. Let $\bar{v}$ be the complement of $v$. Each $P_i$ is
supplied with the same program, AP (called the Agreement Protocol). As
long as a $P_i$ computes according to AP, it is called proper. Once a
processor $P_i$ deviates from AP it is faulty and is considered to
remain faulty even if, later on, it reverts back to following AP. We
assume that the set of processors that eventually become faulty is
independent of the random choices of our algorithm.
Byzantine  Agreement  is  achieved  when 

(1)  All  proper  processors  agree  on  the  same  value,  and 

(2) if the sender $P_0$ sends the same value to everybody (operates
correctly) then all proper processors agree on its value.

Our  analysis  of  the  problem  is  based  on  the  worst  case  assumption 
that  faulty  processors  are  not  predictable  and  may  be  malicious.  An 
algorithm  should  sustain  any  strange  behavior  of  faulty  processors, 
even a collusion to prevent the proper processors from reaching
agreement.

In  our  algorithm  we  employ  authentication  of  messages  by  digital 
signatures.  A  faulty  processor  can  invent  any  unauthenticated 
information  and  furthermore  faulty  processors  can  cooperate  in  the 
production  of  false  messages.  However,  all  processors  share  a 
signature  scheme  that  enables  each  one  to  sign  its  message  so  that 
every  receiver  will  recognize  who  sent  the  message  and  no  one  can  alter 
the  content  of  the  message  or  the  signature  undetectably.  (Such  a 
scheme  was  suggested  by  [RSA,  78]).  We  assume  that  every  message  that 
contains  only  signatures  of  faulty  processors  can  be  produced  by  these 
processors. We shall denote by $\sigma_i(M)$ the message M authenticated
(signed) by $P_i$. Finally, we assume that each processor $P_i$ has a
random  number  generator,  and  can  choose  random  numbers  independently  of 
other  processors.  We  assume  that  the  faulty  processors  cannot  predict 
or  affect  the  result  of  these  choices  of  random  numbers  by  proper 
processors. 

Note  that  our  assumptions  about  authenticated  messages  mean  (as  in 
[D, 82]) that no processor can alter an authenticated message received
at a previous round and forward it as an authenticated message at the
next round, nor can any processor pretend to have received an
authenticated message it did not receive and forward that as an
authenticated message. In the rest of the paper, we assume that
$t \le n/10$, where t is the maximum number of faulty processors.

Definition. A path $\Pi$ of length m is a sequence of m distinct
processor names.

Definition. Let $\Pi = P_{i_1} P_{i_2} \cdots P_{i_m}$ be a path. By
saying "Processor $P_i$ sends a message M to $P_j$ by using $\Pi$" we
mean that: $P_i$ sends a signed message to $P_{i_1}$ consisting of M
appended to the path name. Then $P_{i_1}$ relays the message (after
signing it) to $P_{i_2}$, etc. If $P_i$ and each of the processors in
the path are proper, then $P_j$ will receive a message of the form

$$\sigma_{i_m}(\sigma_{i_{m-1}}(\cdots \sigma_{i_1}(\sigma_i(M, \Pi P_j)) \cdots))$$

(where $\Pi P_j = P_{i_1} P_{i_2} \cdots P_{i_m} P_j$).
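
The nested signing along a path can be sketched as follows. This is only an illustration under stated assumptions: HMAC with a hypothetical per-processor key table (`KEYS`) stands in for the digital signature scheme, and the names `sign`, `send_along_path` and `unwrap` are invented for the example. The sender signs the message together with the route, and every relay on the path wraps the result in its own signature.

```python
import hashlib
import hmac

# Hypothetical key table: HMAC is only a stand-in for real digital signatures.
KEYS = {name: ("secret-" + name).encode() for name in ("P1", "P2", "P3", "P4")}


def _tag(signer, payload):
    """Authentication tag of `payload` under the signer's key."""
    return hmac.new(KEYS[signer], repr(payload).encode(), hashlib.sha256).hexdigest()


def sign(signer, payload):
    """Wrap `payload` in one signature layer: (signer, payload, tag)."""
    return (signer, payload, _tag(signer, payload))


def send_along_path(sender, path, receiver, message):
    """Sender signs (message, route); each relay on the path adds a layer."""
    route = tuple(path) + (receiver,)
    signed = sign(sender, (message, route))
    for relay in path:
        signed = sign(relay, signed)
    return signed


def unwrap(signed):
    """Verify every layer, outermost first; return (signers, message, route)."""
    signers = []
    while isinstance(signed, tuple) and len(signed) == 3:
        signer, payload, tag = signed
        if not hmac.compare_digest(tag, _tag(signer, payload)):
            raise ValueError("bad signature by " + signer)
        signers.append(signer)
        signed = payload
    message, route = signed
    return signers, message, route
```

For example, `unwrap(send_along_path("P1", ["P2", "P3"], "P4", "hello"))` lists the signers outermost first, so a receiver can check that the layers match the route in reverse order.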

Definition. Let $P_i$ send a message M to $P_j$ by using $\Pi$. Then

origin(M) = $P_i$,   destination(M) = $P_j$.

Definition. A signed message S of the form

$$\sigma_{i_k}(\sigma_{i_{k-1}}(\cdots \sigma_{i_1}(\sigma_i(M, \Pi P_j)) \cdots))$$

where $\Pi = P_{i_1} \cdots P_{i_m}$ and $0 \le k < m$, arriving at
processor $P_{i_{k+1}}$, is called a partial signed message arriving at
$P_{i_{k+1}}$. M is called the value of S.

Definition. A processor $P_i$ detects a contradiction if two partial
signed messages arrive at $P_i$ (let them be S, S'), with the following
properties:

(1) origin(S) = $P_k$, origin(S') = $P_{k'}$ (k, k' are not necessarily
distinct)

(2) destination(S) = $P_\ell$, destination(S') = $P_{\ell'}$,
$\ell' \ne \ell$.

(3) value(S) = $\sigma_0(0)$, value(S') = $\sigma_0(1)$

Note that i can be one of $\ell$, $\ell'$ or one of k, k'.
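
The three conditions can be checked mechanically. The sketch below is an illustration only: `PartialMessage` and `find_contradiction` are hypothetical names, signature verification is assumed to have been done already, and `value` holds the bit signed by the commander.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class PartialMessage:
    origin: str       # processor that answered
    destination: str  # processor the answer is travelling to
    value: int        # commander-signed bit carried by the message (0 or 1)


def find_contradiction(messages):
    """Return a pair (S, S') with value 0 and value 1 and distinct
    destinations, or None if the messages seen so far are consistent."""
    for first, second in combinations(messages, 2):
        different_destinations = first.destination != second.destination
        opposite_values = {first.value, second.value} == {0, 1}
        if different_destinations and opposite_values:
            # Order the pair so that S carries 0 and S' carries 1.
            return (first, second) if first.value == 0 else (second, first)
    return None
```

Origins are not compared, since the definition allows them to coincide.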

Definition. A contaminated path is a path containing at least one
faulty processor.

3.   The  Algorithm  AP  for  processor  $P_i$

Fix  a  constant  a  >  2.  (Note  that  a  can  be  set  arbitrarily  large  so 
as  to  arbitrarily  decrease  the  error  probability  of  our  algorithm.) 
Furthermore,  fix  a  constant  c  >  4.  Fix  m  =  a  log  n. 

Fix $g = c \log n \cdot n^{a \log(n/(n-t))}$.
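
To get a feel for the sizes involved, the parameters m and g can be computed for concrete n and t. This is a sketch under assumptions the text does not state: logarithms are taken base 2, and both values are rounded up to integers. The function name `parameters` is hypothetical.

```python
import math


def parameters(n, t, a=3.0, c=5.0):
    """Return (m, g): path length and number of paths per destination.

    Assumes base-2 logarithms and rounds both quantities up.
    """
    if not (a > 2 and c > 4):
        raise ValueError("the algorithm fixes a > 2 and c > 4")
    if not (0 <= t <= n / 10):
        raise ValueError("the analysis assumes t <= n/10")
    log_n = math.log2(n)
    m = math.ceil(a * log_n)
    g = math.ceil(c * log_n * n ** (a * math.log2(n / (n - t))))
    return m, g
```

With no faulty processors the factor $n^{a \log(n/(n-t))}$ equals 1, so g reduces to c log n; g grows as t approaches n/10.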

1.  (Random choice phase). For each $j \in \{1, 2, \ldots, n\} - \{i\}$,
$P_i$ randomly chooses a set $R_{ij}$ of g paths from i to j, each of
length m = a log n.

2.  (Interrogation phase). $P_i$ "interrogates" each $P_j$ ($j \ne i$)
about $M_j$, as follows: $P_i$ sends an interrogation (empty) message to
$P_j$ through each of the paths in $R_{ij}$. (This takes m rounds.)

Then, if $P_i$ has received an interrogation message from $P_k$ through
a path $\Pi$, $P_i$ sends back its $M_i$, by using the path $\Pi$ in
reverse. $P_i$ does this answering for each distinct interrogation
message and each distinct path of arrival. (This answering process also
takes m rounds.)

Note  that  it  may  happen  that  some  answers  will  not  arrive  at  an 
interrogating  processor,  due  to  faulty  processors  not  relaying  the 
answer  in  the  return  path. 

3.  (Validation phase). If $P_i$ has detected a contradiction of the
form {S, S', origin(S) = $P_k$, origin(S') = $P_{k'}$, destination(S) =
$P_\ell$, destination(S') = $P_{\ell'}$, and value(S) = $\sigma_0(0)$,
value(S') = $\sigma_0(1)$} then $P_i$ sends the partial message S'
along each of the g paths in $R_{i\ell}$ (selected in Phase 1) to
$P_\ell$, and furthermore in this case $P_i$ also sends the partial
message S along each of the paths in $R_{i\ell'}$ to $P_{\ell'}$. $P_i$
does this only once for each processor $P_\ell$ being the destination of
a contradiction. That is, if $P_i$ has detected more than one
contradiction involving $P_\ell$, only one is used. The validation phase
takes m = a log n rounds.

4.  (Decision phase). $P_i$ decides on v if and only if v is the only
value  arriving  at  it  in  Phases  2  and  3.  Else,  a  default  value 
corresponding  to  "sender  faulty"  is  produced. 
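
The first and last phases are purely local and can be sketched directly. The code below is an illustration under assumptions: processors are numbered 1..n, a path is taken to be m distinct intermediate processors other than the two endpoints (the text does not say whether endpoints are excluded), paths are drawn independently so duplicates are possible, and the names `choose_paths` and `decide` are hypothetical.

```python
import random


def choose_paths(i, j, n, m, g, rng):
    """Random choice phase: g random paths from i to j, each a tuple of
    m distinct intermediate processors drawn from {1..n} minus {i, j}."""
    candidates = [p for p in range(1, n + 1) if p not in (i, j)]
    return [tuple(rng.sample(candidates, m)) for _ in range(g)]


def decide(values_seen, default="sender faulty"):
    """Decision phase: decide on v only if v is the single value that
    arrived in the interrogation and validation phases."""
    distinct = set(values_seen)
    if len(distinct) == 1:
        return distinct.pop()
    return default
```

A caller would pass a seeded `random.Random` instance as `rng` to make a run reproducible; the protocol itself only requires that faulty processors cannot predict the choices.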


Definition. Let R be the union of $R_{ij}$ for all $P_j$ and all proper
$P_i$.

4.   Analysis  of  the  Algorithm

Lemma 1. Let $E_1$ be the event that for all $\Pi$ in R, $\Pi$ contains
at least one proper processor. Then
$\mathrm{Prob}\{E_1\} \ge 1 - n^{-\gamma}$ for $\gamma > 2$ a constant.

Proof: For each path $\Pi$,

$$\mathrm{Prob}\{\Pi \text{ contains no proper processor}\} \le \left(\frac{t}{n}\right)^{a \log n} = n^{a \log(t/n)} \le n^{-a \log 10}$$

since $t/n \le 1/10$ implies $\log(t/n) \le -\log 10$. Then

$$\mathrm{Prob}\{\text{not } E_1\} \le n^2 g \, n^{-a \log 10} \le n^2 \, c \log n \cdot n^{-a \log 9} \quad \left(\text{because } \frac{n}{n-t} \le \frac{10}{9}\right).$$

I.e.

$$\mathrm{Prob}\{\text{not } E_1\} \le n^{-\gamma},$$

where $\gamma$ is defined to be equal to

$$a \log 9 - 2 - \frac{\log c}{\log n} - \frac{\log \log n}{\log n} \ge a \log 9 - 3 > 2.$$
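
The chain of inequalities in this proof can be checked numerically for sample parameters. The sketch below assumes base-2 logarithms (with which a log 9 - 3 > 2 holds for every a > 2) and treats m and g as real numbers without rounding; the function names are hypothetical.

```python
import math


def lemma1_union_bound(n, t, a, c):
    """n^2 * g * (t/n)^m: union bound on some path in R being all faulty."""
    log_n = math.log2(n)
    m = a * log_n
    g = c * log_n * n ** (a * math.log2(n / (n - t)))
    return n * n * g * (t / n) ** m


def gamma(n, a, c):
    """Exponent gamma as defined at the end of the proof (base-2 logs)."""
    log_n = math.log2(n)
    return (a * math.log2(9) - 2
            - math.log2(c) / log_n
            - math.log2(log_n) / log_n)
```

For n = 1024, t = 102, a = 3, c = 5 the union bound is below $n^{-\gamma}$ and $\gamma$ exceeds 2, as the lemma requires.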


Definition. A path $\Pi$ is called proper if all its processors are
proper.

Lemma 2. Let $E_2$ be the event "For each proper $P_i$ and each $P_j$,
at least one element of $R_{ij}$ is proper". Then

$$\mathrm{Prob}\{E_2\} \ge 1 - n^{-(c-2)}.$$

Proof: The probability that a particular random path in $R_{ij}$ is
proper is

$$\left(1 - \frac{t}{n}\right)^{a \log n} = n^{a \log(1 - t/n)}.$$

Let q be the probability that all paths of $R_{ij}$ are not proper. Then

$$q = \left(1 - \left(1 - \frac{t}{n}\right)^{a \log n}\right)^{g} = \left(1 - n^{-a \log(n/(n-t))}\right)^{g} \le e^{-g \, n^{-a \log(n/(n-t))}}$$

(since $(1 - 1/x)^x \le e^{-1}$ for all $x \ge 1$). So

$$q \le n^{-c},$$

by definition of $g = c \log n \cdot n^{a \log(n/(n-t))}$, $c > 4$.
Hence

$$\mathrm{Prob}\{\text{not } E_2\} \le n^2 q \le n^{-(c-2)}.$$

So

$$\mathrm{Prob}\{E_2\} \ge 1 - n^{-(c-2)}.$$
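
As a sanity check, q can be evaluated directly for sample parameters. The sketch assumes base-2 logarithms and leaves g unrounded; `prob_no_proper_path` is a hypothetical name.

```python
import math


def prob_no_proper_path(n, t, a, c):
    """q: probability that none of the g paths chosen for one pair is proper,
    using the per-path probability (1 - t/n)^(a log n) from the proof."""
    log_n = math.log2(n)
    p_proper = (1 - t / n) ** (a * log_n)
    g = c * log_n * n ** (a * math.log2(n / (n - t)))
    return (1 - p_proper) ** g
```

With n = 1024, t = 102, a = 3, c = 5, the value of q is below $n^{-c}$, and the union bound $n^2 q$ over all pairs is below $n^{-(c-2)}$.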
Lemma 3. Let us condition the algorithm's execution on the events $E_1$
and $E_2$. Then the algorithm achieves Byzantine Agreement at the end of
Phase 3.

Proof:

Case 1. The commander $P_0$ is proper. Then for all i,
$M_i = \sigma_0(M)$ is the same (since M is the same). There is only one
signed value sent by the commander, hence only one such value arrives at
each proper $P_i$ during phases 2 and 3. (The faulty processors cannot
alter $\sigma_0(M)$, or devise a false value for it, due to
authentication.)

Case 2. $P_0$ is faulty. Assume (for the sake of contradiction) that a
proper processor $P_i$ commits to value v (by the end of phase 3) and
another proper $P_j$ commits to a different value
$v' \in \{\bar{v}, \text{"system faulty"}\}$. Then, there is a path
$\Pi$ from $P_i$ to some processor $P_k$, which returns v to $P_i$. By
$E_1$ there is a proper processor $P_s$ on $\Pi$, and (by $E_2$) $P_s$
is indeed among the processors interrogated by $P_j$ for the sender's
value. Hence, $P_s$ will detect a contradiction at the end of phase 2
between the (answering) message sent in phase 2 with origin $P_k$ and
destination $P_i$ and the (answering) message sent in phase 2 with
origin $P_s$ and destination $P_j$. So, $P_s$ will notify $P_j$ about
the value $\sigma_0(v)$ through each of its g paths. By $E_2$, at least
one of these paths will relay the contradiction information to $P_j$
correctly. Hence, it is impossible for $P_j$ to commit to $v'$. Note
that the above argument works even if i = k.

Lemma 4. Our algorithm achieves BA, within O(a log n) rounds, by using
$n^{2+O(\log(n/(n-t)))}$ messages, and with probability of error
$\le n^{-\delta}$, $\delta > 1$.

Proof: Our algorithm makes an error if $E_1$ or $E_2$ do not hold. Thus

$$\mathrm{Prob}(\text{error}) \le n^{-\beta} + n^{-\gamma} \le n^{-\delta},$$

where

$$\delta = \min(\beta, \gamma) - 1,$$

with $\beta = c - 2$ the exponent from Lemma 2 and $\gamma$ the exponent
from Lemma 1.

Our algorithm has three communication stages (interrogation, answering
and validation) and each takes m = a log n rounds. So, the total number
of rounds is 3m = 3a log n. The number of signatures per message is
O(a log n), so the total number of signatures sent is

$$n^2 g \cdot (\#\,\text{rounds}) \cdot (\#\,\text{signatures/message}) \le n^2 g \cdot O(a^2 \log^2 n)$$

$$= c \, n^2 \log n \cdot n^{a \log(n/(n-t))} \cdot O(a^2 \log^2 n)$$

$$= n^{2+O(\log(n/(n-t)))}.$$

This is the same as the total number of messages.
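
The product in this count can be evaluated for concrete parameters. The sketch below is only illustrative: it assumes base-2 logarithms, sets the hidden constants in the O(.) terms to 1 (so a message carries exactly m signatures), and uses hypothetical function names.

```python
import math


def signature_count(n, t, a, c):
    """n^2 * g * (#rounds) * (#signatures per message), constants set to 1."""
    log_n = math.log2(n)
    m = a * log_n                      # path length, also signatures per message
    g = c * log_n * n ** (a * math.log2(n / (n - t)))
    rounds = 3 * m                     # interrogation, answering, validation
    return n * n * g * rounds * m


def effective_exponent(n, t, a, c):
    """Exponent e such that the count equals n^e."""
    return math.log(signature_count(n, t, a, c), n)
```

For moderate n the polylogarithmic factors are still visible in the exponent, and the exponent increases with t, which is the dependence on log(n/(n-t)) stated in the lemma.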

Lemma 5. We can modify our algorithm to use only
$t \cdot n^{1+O(\log(n/(n-t)))}$ messages and signatures, O(log n)
rounds with the same success probability.

Proof: We modify our algorithm as follows: We choose arbitrarily t+1
processors to be "relay" processors. We require any nonrelay processor
to send messages only to relay processors (the commander cannot be a
relay). So, if processor $P_i$ wants to send a message to processor
$P_j$, $P_i$ sends the (signed) message to the t+1 relay processors.
They, then, send it to $P_j$. Clearly, at least one relay processor is
proper and the faulty ones cannot alter messages, due to
authentication. (This is the same technique as used in [DS, 82a]).
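
The delivery guarantee behind the relay technique is a pigeonhole argument, sketched below. The model is an assumption made for illustration: faulty relays are treated as silent (they cannot alter a signed message, so dropping it is the worst they can do to delivery), and the names `choose_relays` and `relay_send` are hypothetical.

```python
def choose_relays(processors, t):
    """Pick an arbitrary set of t+1 relay processors (here: the first t+1)."""
    return list(processors)[: t + 1]


def relay_send(message, relays, faulty):
    """Return the (relay, message) copies that reach the receiver when
    every faulty relay silently drops the message."""
    return [(relay, message) for relay in relays if relay not in faulty]
```

With t+1 relays and at most t faulty processors, the returned list is never empty, so the receiver always obtains at least one unaltered signed copy.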

Acknowledgment: We wish to thank R. Cole for useful comments on this
paper.